Search | WHO COVID-19 Research Database

1.

Tracking the COVID-19 outbreak in India through Twitter: Opportunities for social media based global pandemic surveillance.

Lakamana, Sahithi; Yang, Yuan-Chi; Al-Garadi, Mohammed Ali; Sarker, Abeed.

AMIA Annu Symp Proc ; 2022: 313-322, 2022.

Article in English | MEDLINE | ID: covidwho-20238373

ABSTRACT

We investigated the utility of Twitter for conducting multi-faceted geolocation-centric pandemic surveillance, using India as an example. We collected over 4 million COVID-19-related tweets about the Indian outbreak between January and July 2021. We geolocated the tweets, applied natural language processing to characterize them (e.g., identifying symptoms and emotions), and compared tweet volumes with the numbers of confirmed COVID-19 cases. Tweet numbers closely mirrored the outbreak, with the 7-day average strongly correlated with confirmed COVID-19 cases nationally (Spearman r=0.944; p=0.001), and also at the state level (Spearman r=0.84; p=0.0003). Fatigue, dyspnea, and cough were the top symptoms detected, while there was a significant increase in the proportion of tweets expressing negative emotions (e.g., fear and sadness). The surge in COVID-19 tweets was followed by an increased number of posts expressing concern about black fungus and oxygen supply. Our study illustrates the potential of social media for multi-faceted pandemic surveillance.

Subject(s)

COVID-19 , Social Media , COVID-19/epidemiology , Disease Outbreaks , Humans , Natural Language Processing , Pandemics
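
The correlation described in the abstract above (7-day average of tweet volume against confirmed cases, measured with Spearman's rank correlation) can be illustrated with a minimal sketch. This is not the authors' code: the daily counts are made-up numbers, and the helper names `rolling_mean`, `average_ranks`, and `spearman` are hypothetical.

```python
from statistics import mean

def rolling_mean(values, window=7):
    """Trailing moving average; returns len(values) - window + 1 points."""
    return [mean(values[i - window + 1:i + 1]) for i in range(window - 1, len(values))]

def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))

# Illustrative (made-up) daily counts, not data from the study.
daily_tweets = [120, 135, 150, 180, 240, 310, 400, 520, 610, 700, 690, 640, 560, 480]
daily_cases = [900, 950, 1100, 1300, 1700, 2300, 3100, 3900, 4800, 5200, 5300, 5000, 4300, 3600]

rho = spearman(rolling_mean(daily_tweets), rolling_mean(daily_cases))
print(f"Spearman rho on 7-day averages = {rho:.3f}")
```

Rank-based correlation is a natural fit here because tweet counts and case counts live on very different scales and need only rise and fall together, not linearly.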

2.

Characteristics of Intimate Partner Violence and Survivor's Needs During the COVID-19 Pandemic: Insights From Subreddits Related to Intimate Partner Violence.

Kim, Sangmi; Warren, Elise; Jahangir, Tasfia; Al-Garadi, Mohammed; Guo, Yuting; Yang, Yuan-Chi; Lakamana, Sahithi; Sarker, Abeed.

J Interpers Violence ; : 8862605231168816, 2023 Apr 27.

Article in English | MEDLINE | ID: covidwho-2297882

ABSTRACT

Intimate partner violence (IPV) increased during the COVID-19 pandemic. Collecting actionable IPV-related data from conventional sources (e.g., medical records) was challenging during the pandemic, generating a need to obtain relevant data from non-conventional sources, such as social media. Social media, like Reddit, is a preferred medium of communication for IPV survivors to share their experiences and seek support with protected anonymity. Nevertheless, the scope of available IPV-related data on social media is rarely documented. Thus, we examined the availability of IPV-related information on Reddit and the characteristics of the reported IPV during the pandemic. Using natural language processing, we collected publicly available Reddit data from four IPV-related subreddits between January 1, 2020 and March 31, 2021. Of 4,000 collected posts, we randomly sampled 300 posts for analysis. Three individuals on the research team independently coded the data and resolved the coding discrepancies through discussions. We adopted quantitative content analysis and calculated the frequency of the identified codes. 36% of the posts (n = 108) constituted self-reported IPV by survivors, of which 40% regarded current/ongoing IPV, and 14% contained help-seeking messages. A majority of the survivors' posts reflected psychological aggression, followed by physical violence. Notably, 61.4% of the psychological aggression involved expressive aggression, followed by gaslighting (54.3%) and coercive control (44.3%). Survivors' top three needs during the pandemic were hearing similar experiences, legal advice, and validating their feelings/reactions/thoughts/actions. Albeit limited, data from bystanders (survivors' friends, family, or neighbors) were also available. Rich data reflecting IPV survivors' lived experiences were available on Reddit. Such information will be useful for IPV surveillance, prevention, and intervention.

3.

The Early Detection of Fraudulent COVID-19 Products From Twitter Chatter: Data Set and Baseline Approach Using Anomaly Detection.

Sarker, Abeed; Lakamana, Sahithi; Liao, Ruqi; Abbas, Aamir; Yang, Yuan-Chi; Al-Garadi, Mohammed.

JMIR Infodemiology ; 3: e43694, 2023.

Article in English | MEDLINE | ID: covidwho-2303135

ABSTRACT

Background: Social media has served as a lucrative platform for spreading misinformation and for promoting fraudulent products for the treatment, testing, and prevention of COVID-19. This has resulted in the issuance of many warning letters by the US Food and Drug Administration (FDA). While social media continues to serve as the primary platform for the promotion of such fraudulent products, it also presents the opportunity to identify these products early by using effective social media mining methods. Objective: Our objectives were to (1) create a data set of fraudulent COVID-19 products that can be used for future research and (2) propose a method using data from Twitter for automatically detecting heavily promoted COVID-19 products early. Methods: We created a data set from FDA-issued warnings during the early months of the COVID-19 pandemic. We used natural language processing and time-series anomaly detection methods for automatically detecting fraudulent COVID-19 products early from Twitter. Our approach is based on the intuition that increases in the popularity of fraudulent products lead to corresponding anomalous increases in the volume of chatter regarding them. We compared the anomaly signal generation date for each product with the corresponding FDA letter issuance date. We also performed a brief manual analysis of chatter associated with 2 products to characterize their contents. Results: FDA warning issue dates ranged from March 6, 2020, to June 22, 2021, and 44 key phrases representing fraudulent products were included. From 577,872,350 posts made between February 19 and December 31, 2020, which are all publicly available, our unsupervised approach detected 34 out of 44 (77.3%) signals about fraudulent products earlier than the FDA letter issuance dates, and an additional 6 (13.6%) within a week following the corresponding FDA letters. Content analysis revealed misinformation, information, politics, and conspiracy theories to be prominent topics. Conclusions: Our proposed method is simple, effective, easy to deploy, and does not require high-performance computing machinery, unlike deep neural network-based methods. The method can be easily extended to other types of signal detection from social media data. The data set may be used for future research and the development of more advanced methods.
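
The intuition in the abstract above (an anomalous jump in chatter volume about a product acts as an early signal, which is then compared with the FDA letter date) can be sketched as follows. The abstract does not name the specific anomaly detector, so this example assumes a simple trailing z-score rule; the function `first_anomaly`, the counts, and both dates are hypothetical.

```python
from datetime import date, timedelta
from statistics import mean, pstdev

def first_anomaly(counts, start, window=7, threshold=3.0):
    """Return the date of the first day whose count exceeds the trailing
    `window`-day mean by more than `threshold` standard deviations, or None."""
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu, sigma = mean(history), pstdev(history)
        # Floor sigma at 1.0 so a flat history neither divides by zero
        # nor flags a change of a single post.
        if (counts[i] - mu) / max(sigma, 1.0) > threshold:
            return start + timedelta(days=i)
    return None

# Illustrative (made-up) daily mention counts for one product key phrase.
counts = [3, 4, 2, 5, 3, 4, 3, 4, 5, 3, 40, 55, 60, 30]
start = date(2020, 3, 1)          # hypothetical first day of the series
letter_date = date(2020, 3, 20)   # hypothetical warning-letter date

signal = first_anomaly(counts, start)
if signal is not None:
    lead_days = (letter_date - signal).days
    print(f"signal on {signal}, {lead_days} days before the letter")
```

A rule like this needs no training data or specialized hardware, which is in line with the abstract's point about the method being unsupervised and easy to deploy.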

4.

Automatic Detection of Twitter Users Who Express Chronic Stress Experiences via Supervised Machine Learning and Natural Language Processing.

Yang, Yuan-Chi; Xie, Angel; Kim, Sangmi; Hair, Jessica; Al-Garadi, Mohammed; Sarker, Abeed.

Comput Inform Nurs ; 2022 Nov 28.

Article in English | MEDLINE | ID: covidwho-2135636

ABSTRACT

Americans bear a high chronic stress burden, particularly during the COVID-19 pandemic. Although social media have many strengths that complement the weaknesses of conventional stress measures, including surveys, they have rarely been used to detect individuals self-reporting chronic stress. Thus, this study aimed to develop and evaluate an automatic system to identify Twitter users who have self-reported chronic stress experiences. Using the Twitter public streaming application programming interface, we collected tweets containing certain stress-related keywords (e.g., "chronic," "constant," "stress") and then filtered the data using pre-defined text patterns. We manually annotated tweets as positive if they contained a self-report of chronic stress and as negative otherwise. We trained multiple classifiers and evaluated them using accuracy and F1 score. We annotated 4195 tweets (1560 positives, 2635 negatives), achieving an inter-annotator agreement of 0.83 (Cohen's kappa). The classifier based on Bidirectional Encoder Representations from Transformers (BERT) performed best (accuracy of 83.6% [81.0-86.1]), outperforming the second best-performing classifier (support vector machines: 76.4% [73.5-79.3]). The past tweets from the authors of positive tweets contained useful information, including sources and health impacts of chronic stress. Our study demonstrates that users' self-reported chronic stress experiences can be automatically identified on Twitter, which has a high potential for surveillance and large-scale intervention.
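
The three metrics named in the abstract above (Cohen's kappa for inter-annotator agreement, and accuracy and F1 score for classifier evaluation) can be computed as in the sketch below. The label lists are made-up toy data, not the study's annotations, and the function names are hypothetical.

```python
def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators on the same items."""
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    categories = set(labels_a) | set(labels_b)
    # Agreement expected by chance, from each annotator's label frequencies.
    expected = sum((labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories)
    if expected == 1:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)

def accuracy_and_f1(gold, predicted, positive=1):
    """Accuracy over all items and F1 score for the positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == positive and p == positive for g, p in pairs)
    fp = sum(g != positive and p == positive for g, p in pairs)
    fn = sum(g == positive and p != positive for g, p in pairs)
    accuracy = sum(g == p for g, p in pairs) / len(pairs)
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return accuracy, f1

# Toy labels: 1 = self-report of chronic stress, 0 = no self-report.
annotator_a = [1, 1, 0, 0, 1, 0, 1, 0, 0, 0]
annotator_b = [1, 1, 0, 0, 1, 0, 0, 0, 0, 1]
predictions = [1, 0, 0, 0, 1, 0, 1, 1, 0, 0]

kappa = cohen_kappa(annotator_a, annotator_b)
accuracy, f1 = accuracy_and_f1(annotator_a, predictions)
print(f"kappa={kappa:.3f} accuracy={accuracy:.2f} F1={f1:.2f}")
```

Kappa is lower than raw agreement because it discounts the matches two annotators would reach by chance given how often each uses each label.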

5.

The Role of Natural Language Processing during the COVID-19 Pandemic: Health Applications, Opportunities, and Challenges.

Al-Garadi, Mohammed Ali; Yang, Yuan-Chi; Sarker, Abeed.

Healthcare (Basel) ; 10(11), 2022 Nov 12.

Article in English | MEDLINE | ID: covidwho-2110008

ABSTRACT

The COVID-19 pandemic is the most devastating public health crisis in at least a century and has affected the lives of billions of people worldwide in unprecedented ways. Compared to pandemics of this scale in the past, societies are now equipped with advanced technologies that can mitigate the impacts of pandemics if utilized appropriately. However, opportunities are currently not fully utilized, particularly at the intersection of data science and health. Health-related big data and technological advances have the potential to significantly aid the fight against such pandemics, including the current pandemic's ongoing and long-term impacts. Specifically, the field of natural language processing (NLP) has enormous potential at a time when vast amounts of text-based data are continuously generated from a multitude of sources, such as health/hospital systems, published medical literature, and social media. Effectively mitigating the impacts of the pandemic requires tackling challenges associated with the application and deployment of NLP systems. In this paper, we review the applications of NLP to address diverse aspects of the COVID-19 pandemic. We outline key NLP-related advances on a chosen set of topics reported in the literature and discuss the opportunities and challenges associated with applying NLP during the current pandemic and future ones. These opportunities and challenges can guide future research aimed at improving the current health and social response systems and pandemic preparedness.

6.

Tracking the COVID-19 outbreak in India through Twitter: Opportunities for social media based global pandemic surveillance

Lakamana, Sahithi; Yang, Yuan-Chi; Al-Garadi, Mohammed Ali; Sarker, Abeed.

AMIA ... Annual Symposium proceedings. AMIA Symposium ; 2022:313-322, 2022.

Article in English | EuropePMC | ID: covidwho-1940078

ABSTRACT

We investigated the utility of Twitter for conducting multi-faceted geolocation-centric pandemic surveillance, using India as an example. We collected over 4 million COVID-19-related tweets about the Indian outbreak between January and July 2021. We geolocated the tweets, applied natural language processing to characterize them (e.g., identifying symptoms and emotions), and compared tweet volumes with the numbers of confirmed COVID-19 cases. Tweet numbers closely mirrored the outbreak, with the 7-day average strongly correlated with confirmed COVID-19 cases nationally (Spearman r=0.944; p=0.001), and also at the state level (Spearman r=0.84; p=0.0003). Fatigue, dyspnea, and cough were the top symptoms detected, while there was a significant increase in the proportion of tweets expressing negative emotions (e.g., fear and sadness). The surge in COVID-19 tweets was followed by an increased number of posts expressing concern about black fungus and oxygen supply. Our study illustrates the potential of social media for multi-faceted pandemic surveillance.

7.

A Light-Weight Text Summarization System for Fast Access to Medical Evidence.

Sarker, Abeed; Yang, Yuan-Chi; Al-Garadi, Mohammed Ali; Abbas, Aamir.

Front Digit Health ; 2: 585559, 2020.

Article in English | MEDLINE | ID: covidwho-1497037

ABSTRACT

As the volume of published medical research continues to grow rapidly, staying up-to-date with the best-available research evidence regarding specific topics is becoming an increasingly challenging problem for medical experts and researchers. The current COVID-19 pandemic is a good example of a topic on which research evidence is rapidly evolving. Automatic query-focused text summarization approaches may help researchers to swiftly review research evidence by presenting salient and query-relevant information from newly-published articles in a condensed manner. Typical medical text summarization approaches require domain knowledge, and the performances of such systems rely on resource-heavy medical domain-specific knowledge sources and pre-processing methods (e.g., text classification) for deriving semantic information. Consequently, these systems are often difficult to speedily customize, extend, or deploy in low-resource settings, and they are often operationally slow. In this paper, we propose a fast and simple extractive summarization approach that can be easily deployed and run, and may thus help medical experts and researchers obtain fast access to the latest research evidence. At runtime, our system utilizes similarity measurements derived from pre-trained medical domain-specific word embeddings in addition to simple features, rather than computationally-expensive pre-processing and resource-heavy knowledge bases. Automatic evaluation using ROUGE (a summary evaluation tool) on a public dataset for evidence-based medicine shows that our system's performance, despite the simple implementation, is statistically comparable with the state-of-the-art. Extrinsic manual evaluation based on recently-released COVID-19 articles demonstrates that the summarizer performance is close to human agreement, which is generally low, for extractive summarization.
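
The core idea in the abstract above, scoring sentences by their embedding similarity to a query and extracting the best ones, can be sketched in a few lines. This is only an illustration under strong assumptions: the 3-dimensional vectors in `TOY_VECTORS` are invented stand-ins for pre-trained medical word embeddings, the sentences are made up, and the sketch uses similarity alone, whereas the paper's system also combines it with other simple features.

```python
import math

# Hypothetical toy "embeddings"; a real system would load pre-trained vectors.
TOY_VECTORS = {
    "remdesivir": (0.9, 0.1, 0.0),
    "antiviral": (0.8, 0.2, 0.1),
    "treatment": (0.7, 0.3, 0.1),
    "mortality": (0.5, 0.6, 0.2),
    "trial": (0.4, 0.5, 0.3),
    "weather": (0.0, 0.1, 0.9),
    "travel": (0.1, 0.0, 0.8),
}

def embed(text):
    """Average the vectors of known words; None if no word is known."""
    words = text.lower().replace(".", "").split()
    vectors = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not vectors:
        return None
    return tuple(sum(dim) / len(vectors) for dim in zip(*vectors))

def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def summarize(query, sentences, k=1):
    """Return the k sentences most similar to the query, in document order."""
    q = embed(query)
    scored = []
    for idx, sentence in enumerate(sentences):
        vec = embed(sentence)
        score = cosine(q, vec) if q and vec else 0.0
        scored.append((score, idx))
    top = sorted(scored, key=lambda t: (-t[0], t[1]))[:k]
    return [sentences[i] for _, i in sorted(top, key=lambda t: t[1])]

sentences = [
    "The trial reported lower mortality with antiviral treatment.",
    "Travel and weather were not assessed.",
    "Remdesivir is an antiviral.",
]
print(summarize("remdesivir treatment", sentences, k=2))
```

Because scoring needs only vector lookups and a dot product per sentence, an approach of this shape stays fast at runtime, which is the trade-off the abstract emphasizes.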

8.

Self-reported COVID-19 symptoms on Twitter: an analysis and a research resource.

Sarker, Abeed; Lakamana, Sahithi; Hogg-Bremer, Whitney; Xie, Angel; Al-Garadi, Mohammed Ali; Yang, Yuan-Chi.

J Am Med Inform Assoc ; 27(8): 1310-1315, 2020 Aug 01.

Article in English | MEDLINE | ID: covidwho-632174

ABSTRACT

OBJECTIVE: To mine Twitter and quantitatively analyze COVID-19 symptoms self-reported by users, compare symptom distributions across studies, and create a symptom lexicon for future research. MATERIALS AND METHODS: We retrieved tweets using COVID-19-related keywords, and performed semiautomatic filtering to curate self-reports of positive-tested users. We extracted COVID-19-related symptoms mentioned by the users, mapped them to standard concept IDs in the Unified Medical Language System, and compared the distributions to those reported in early studies from clinical settings. RESULTS: We identified 203 positive-tested users who reported 1002 symptoms using 668 unique expressions. The most frequently-reported symptoms were fever/pyrexia (66.1%), cough (57.9%), body ache/pain (42.7%), fatigue (42.1%), headache (37.4%), and dyspnea (36.3%) amongst users who reported at least 1 symptom. Mild symptoms, such as anosmia (28.7%) and ageusia (28.1%), were frequently reported on Twitter, but not in clinical studies. CONCLUSION: The spectrum of COVID-19 symptoms identified from Twitter may complement those identified in clinical settings.

Subject(s)

Coronavirus Infections , Pandemics , Pneumonia, Viral , Self Report , Social Media , Symptom Assessment , Betacoronavirus , COVID-19 , Coronavirus Infections/complications , Coronavirus Infections/diagnosis , Data Mining , Humans , Natural Language Processing , Pneumonia, Viral/complications , Pneumonia, Viral/diagnosis , SARS-CoV-2
